ILQUA at TREC 2006
Abstract
This year, we changed the passage/sentence retrieval component that ILQUA uses to handle factoid and list questions. All other components remain the same. ILQUA is an IE-driven QA system. To answer “Factoid” and “List” questions, we apply our answer extraction methods to NE-tagged passages or sentences. The answer extraction methods adopted here are surface text pattern matching, n-gram proximity search, and syntactic dependency matching. Although surface text pattern matching has been applied in some previous TREC QA systems, the patterns used in ILQUA are better because they are automatically generated by a supervised learning system and represented as regular expressions that contain multiple question terms. In n-gram proximity search, n-grams of question terms are matched around every named entity in the candidate sentences or passages, and a list of named entities is generated as answer candidates. These named entities then go through a multi-level syntactic dependency matching component until a final answer is generated. To answer “Other” questions, we parsed the answer sentences of “Other” questions from previous main tasks and built syntactic patterns combined with semantic features. These patterns are later applied to the parsed candidate sentences to extract answers to “Other” questions. Figure 1 shows the diagram of the ILQUA architecture.
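The n-gram proximity step described in the abstract can be illustrated with a minimal Python sketch. It is not ILQUA's implementation: the inline `<NE type="...">` tag format, the five-token window, the scoring rule (sum of the lengths of matched question n-grams), and the names `rank_candidates`, `ngrams`, and `tokenize` are all assumptions made for illustration.

```python
import re
from typing import List, Set, Tuple

# Hypothetical inline NE tag format (an assumption, not ILQUA's format):
#   <NE type="DATE">1969</NE>
NE_TAG = re.compile(r'<NE type="(\w+)">(.*?)</NE>')


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens: List[str], max_n: int) -> Set[Tuple[str, ...]]:
    """Return all contiguous n-grams of length 1..max_n."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def rank_candidates(question_terms: List[str], tagged_sentence: str,
                    expected_type: str, window: int = 5,
                    max_n: int = 3) -> List[Tuple[str, int]]:
    """Score each named entity of the expected type by the question
    n-grams found within `window` tokens on either side of it."""
    q_grams = ngrams([t.lower() for t in question_terms], max_n)
    scored = []
    for m in NE_TAG.finditer(tagged_sentence):
        if m.group(1) != expected_type:
            continue
        # Strip NE markup from the context so only the words remain.
        left = tokenize(NE_TAG.sub(r"\2", tagged_sentence[:m.start()]))[-window:]
        right = tokenize(NE_TAG.sub(r"\2", tagged_sentence[m.end():]))[:window]
        context = ngrams(left, max_n) | ngrams(right, max_n)
        # Assumed scoring rule: longer matched n-grams count for more.
        score = sum(len(g) for g in q_grams & context)
        scored.append((m.group(2), score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sentence = ('Apollo 11 landed on the Moon in <NE type="DATE">1969</NE>, '
                'and <NE type="PERSON">Neil Armstrong</NE> retired in '
                '<NE type="DATE">1971</NE>.')
    print(rank_candidates(["Apollo", "11", "landed", "Moon"], sentence, "DATE"))
```

In this sketch the ranked list plays the role of the answer candidates that, in ILQUA, would then be passed to syntactic dependency matching; that later stage is not modeled here.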
Similar resources
ILQUA--An IE-Driven Question Answering System
ILQUA first participated in the TREC QA main task in 2003. This year we have modified the system by removing some poorly performing components and enhancing it with new methods and new components. The newly built ILQUA is an IE-driven QA system. To answer “Factoid” and “List” questions, we apply our answer extraction methods on NE-tagged passages. The answer extraction metho...
Questioning Answering By Pattern Matching, Web-Proofing, Semantic Form Proofing
In this paper, we introduce the University at Albany’s question answering system, ILQUA. It is built on the following methods: pattern matching over annotated text, web-proofing, and semantic form proofing. These methods are currently used in other QA systems; however, we revised them to work together in our QA system.
UAlbany's ILQUA at TREC 2007
The TREC 2007 QA track introduced a combined collection of 175GB of blog data and 2.5GB of newswire data. This introduced an additional challenge for an automatic QA system: processing data in different formats without sacrificing accuracy. In ILQUA we added a data preprocessing component to filter out noisy blog data. ILQUA has been built as an IE-driven QA system; it extracts answers...
Overview of the TREC 2006
The fifteenth Text REtrieval Conference, TREC 2006, was held at the National Institute of Standards and Technology (NIST) 14 to 17 November 2006. The conference was co-sponsored by NIST and the Disruptive Technology Office (DTO). TREC 2006 had 107 participating groups from 17 different countries. Table 2 at the end of the paper lists the participating groups. TREC 2006 is the latest in a series...
MG4J at TREC 2006
MG4J participated in the ad hoc task of the Terabyte Track (find all the relevant documents with high precision from 25.2 million pages from the .gov domain) at TREC 2006. It was the second time the MG4J group participated in TREC. For this year, we integrated standard techniques (such as stemming and BM25 scoring) into MG4J, and also submitted automatic runs based on trivial query expansion te...